Why the phylogenetic regression appears robust to tree misspecification.
Abstract
The phylogenetic comparative method uses estimates of evolutionary relationships to explicitly model the covariance structure of interspecific data. By accounting for common ancestry, the coevolution between two or more traits, as a response to one another or to environmental variables, can be studied without confounding similarities due to identity by descent. Because the true phylogeny is unknowable, an estimate must be used, introducing a source of error into phylogenetic comparative analysis that can be difficult to quantify. This manuscript aims to elucidate how tree misspecification is propagated through a comparative analysis. I focus on the phylogenetic regression under a Brownian motion model of evolution and consider the effect of local phylogenetic perturbations on the regression fit. Motivated by Felsenstein's method of independent contrasts, I derive a matrix square root of the phylogenetic covariance matrix that has an obvious phylogenetic interpretation. I use this result to transform the perturbed phylogenetic regression model into an ordinary linear regression in which one interpretable point has been affected. The simplicity of this formulation allows the contributions of data and phylogeny to be disentangled when studying the effect of tree misspecification. Consequently, I find that branch length misspecification can be easily explained in terms of the reweighting of contrast scores between subtrees. An analytical consideration of this and other perturbations helps to explain why the phylogenetic regression appears generally to be robust to tree misspecification, and I am able to identify conditions under which the regression may not yield robust results. I discuss why soft polytomies do not meet these problematic conditions, leading to the conclusion that unresolved bifurcations should have only modest effects on the regression fit.
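The general idea of turning a phylogenetic regression into an ordinary one can be sketched numerically. The example below is only an illustration under stated assumptions: it uses a Cholesky factor as the matrix square root, which is not the contrast-based square root derived in the paper, and the four-taxon tree, the trait values, and the function names `gls_fit` and `whitened_ols_fit` are all hypothetical. It checks that generalized least squares with a Brownian motion covariance matrix gives the same coefficients as ordinary least squares on the transformed data, and then refits after lengthening one internal branch.

```python
import numpy as np

def gls_fit(X, y, C):
    """Generalized least squares coefficients for covariance matrix C."""
    Cinv_X = np.linalg.solve(C, X)
    Cinv_y = np.linalg.solve(C, y)
    return np.linalg.solve(X.T @ Cinv_X, X.T @ Cinv_y)

def whitened_ols_fit(X, y, C):
    """Ordinary least squares after transforming by a square root of C.

    Assumption: a Cholesky factor (C = L @ L.T) stands in for the
    phylogenetically interpretable square root described in the abstract.
    """
    L = np.linalg.cholesky(C)
    Xw = np.linalg.solve(L, X)  # L^{-1} X
    yw = np.linalg.solve(L, y)  # L^{-1} y
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta

# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1). Under Brownian motion,
# entry (i, j) is the branch length shared by tips i and j from the root.
C_tree = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])

# Made-up trait values, for illustration only.
x = np.array([1.0, 2.0, 4.0, 3.0])
y = np.array([2.1, 3.9, 8.2, 5.8])
X = np.column_stack([np.ones_like(x), x])  # intercept and slope columns

beta_gls = gls_fit(X, y, C_tree)
beta_ols = whitened_ols_fit(X, y, C_tree)
print("GLS and transformed OLS agree:", np.allclose(beta_gls, beta_ols))

# Local perturbation: lengthen the internal branch above (A, B) by 0.5,
# which adds 0.5 to every variance and covariance within that clade.
C_pert = C_tree.copy()
C_pert[:2, :2] += 0.5
beta_pert = gls_fit(X, y, C_pert)
print("coefficients, original tree: ", beta_gls)
print("coefficients, perturbed tree:", beta_pert)
```

Comparing the two coefficient vectors gives a rough feel for how much a single branch length change moves the fit on this toy data set; it does not reproduce the paper's analytical results.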
Journal: Systematic Biology
Volume 60, Issue 3
Publication date: 2011